in 810 612 

AUTHOR        Williams, Rick L.; And Others
TITLE         NAEP Year 11 Design Efficiency Study. Final Report.
INSTITUTION   Research Triangle Inst., Research Triangle Park, N.C.
SPONS AGENCY  Education Commission of the States, Denver, Colo. National Assessment of Educational Progress.; National Center for Education Statistics (ED), Washington, D.C.; National Inst. of Education (ED), Washington, D.C.
REPORT NO     RTI-1969-01-01-F
PUB DATE      Jun 81
CONTRACT      OEC-0-74-0506
GRANT         NIE-G-80-0003
NOTE          50p.
EDRS PRICE    MF01/PC02 Plus Postage.
DESCRIPTORS   Educational Assessment; Elementary Secondary Education; Error of Measurement; National Competency Tests; *Research Design; *Research Methodology; *Sampling; Testing Programs
IDENTIFIERS   *National Assessment of Educational Progress

ABSTRACT

The National Assessment of Educational Progress in-school sampling design is a three-stage stratified design. Stratification variables include region, size of community and socioeconomic status. The three levels of sample selection are Primary Sampling Units (PSUs), schools and students. In general, two and sometimes three PSUs are selected from each stratum for variance estimation. The stratification variables are assumed fixed and not subject to change; therefore, the problem of finding the optimal design is reduced to finding the number of PSUs, schools and students per stratum that will minimize cost for a given variance. Following a brief overview of the sample drawn for Year 11, presented in Section 2, the cost model developed for the purpose of the present study is outlined in Section 3. Section 4 describes the statistics which were selected for analysis, and Section 5 derives the corresponding variance and covariance component models. Finally, Section 6 describes the optimization procedure used, and Section 7 provides a summary of the results. Primary type of information provided by report: Procedures (Sampling). (Author/BH)

Reproductions supplied by EDRS are the best that can be made from the original document.




RTI/1969/01-04 F



Final Report 



NAEP Year 11 Design Efficiency Study 



by 



Rick L. Williams 
David V. Budescu 
James R. Chromy 



DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

This document has been reproduced as received from the person or organization originating it.
Minor changes have been made to improve reproduction quality.
Points of view or opinions stated in this document do not necessarily represent official NIE position or policy.



Sampling Research and Design Center 



Prepared for
National Assessment of Educational Progress
June 1981

RESEARCH TRIANGLE PARK, NORTH CAROLINA 27709



TABLE OF CONTENTS

LIST OF TABLES

1. INTRODUCTION

2. SAMPLE OVERVIEW

3. COST MODELS

4. STATISTICS ANALYZED

5. VARIANCE AND COVARIANCE MODELS

6. OPTIMIZATION PROBLEM

7. NUMERICAL RESULTS

8. CONCLUSIONS

REFERENCES

APPENDIX A: COVARIANCE COMPONENTS FOR TWO ESTIMATED TOTALS



LIST OF TABLES

Table

3-1   Percentage of Time Allocated to Different Activities by DSs and EAs
3-2   Variable Cost Components Summary: 1980 Levels
3-3   Estimated Cost Parameters
4-1   Within-Package Score Definitions
4-2   Cross-Package Score Definitions
4-3   Analysis Populations
7-1   NAEP Design Optimization for Within-Package Means
7-2   NAEP Design Optimization for Within-Package Means
7-3   Projected Percent RSE's for Within-Package Means Assuming 10 PSU's/Stratum, 2 Rep's/PSU and 15 Students/Rep
7-4   Percent RSE Constraints for Within-Package Means
7-5   NAEP Design Optimization for Within-Package Means
7-6   NAEP Design Optimization for Within-Package Means
7-7   NAEP Design Optimization for Within-Package Means
7-8   NAEP Design Optimizations for Cross-Package Means
7-9   Projected Percent RSE's for Cross-Package Means Assuming 50 PSU's, 2 Rep's/PSU, and 15 Students/Rep
7-10  Percent RSE Constraints for Cross-Package Means



1. INTRODUCTION

Design efficiency studies are conducted to determine whether sampling procedures have been effective and what sample design provides minimum cost for a given variance. The results from the design efficiency studies are used to plan future assessments.

The National Assessment in-school sampling design is a three-stage stratified design. Stratification variables include region, size of community (SOC) and socioeconomic status (SES). The three levels of sample selection are PSUs, schools and students. In general, two and sometimes three PSUs are selected from each stratum for variance estimation. The stratification variables are assumed fixed and not subject to change; therefore, the problem of finding the optimal design is reduced to finding the number of PSUs, schools and students per stratum that will minimize cost for a given variance. One of the objectives of the design efficiency study has been to determine the "optimal" values of these parameters.

If only one statistic is used, the solution for the optimal design is well known (Kish, 1974). It is rare, however, in any survey to have only one statistic of interest. In the past, the best optimality criterion for many statistics was not obvious. Some possibilities that have been considered are: 1) the design that has minimum average variance at the given cost; 2) the design with maximum average efficiency relative to the separate statistic optima; 3) the design with minimum average loss (inverse efficiency); and 4) the design that minimizes cost subject to variance constraints for each of the separate statistics.



The average of several quantities is meaningful only if all the quantities are measured on the same scale with similar units. The variances of different statistics would be measured on different scales; hence, the minimum average variance does not seem to be a meaningful criterion. To avoid this problem, the efficiency for a particular statistic is defined. This efficiency is a ratio with numerator equal to the minimum variance that can be achieved by the optimal design for that statistic and the denominator equal to the variance of the statistic for the given design. The problem now is what criterion can be formulated for optimality, based on the average efficiency. A possibility is to find the design with large average efficiency at the given cost with small variance of efficiencies over all statistics for this design. The trade-off between the maximum mean and minimum variance of efficiencies is not easy to define. This criterion was used in the Year 3 efficiency study (Shah, Folsom, Clayton, 1973).

Kish (1974) has advocated inverting the efficiency to form what he calls the loss function of a particular design. The advantage of Kish's loss function approach is that a simple analytic solution exists for the minimum average loss, where the averaging may be weighted. This is the optimal criterion used in the Year 7 efficiency study (Sherdon, Folsom, Clemmer, 1977).
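The efficiency and loss criteria described above can be illustrated with a short sketch. The function names and all variance values below are hypothetical and chosen only to show the arithmetic; they are not taken from any NAEP efficiency study.

```python
def efficiency(min_variance, design_variance):
    """Efficiency of a design for one statistic: the minimum variance
    attainable under that statistic's own optimal design, divided by
    the variance under the design being evaluated."""
    return min_variance / design_variance


def average_loss(min_variances, design_variances, weights=None):
    """Weighted average of the inverse efficiencies (the 'loss')
    over several statistics. Equal weights are used by default."""
    n = len(min_variances)
    if weights is None:
        weights = [1.0] * n
    losses = [dv / mv for mv, dv in zip(min_variances, design_variances)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Hypothetical variances for three statistics (illustrative values only).
min_var = [0.0004, 0.0010, 0.0025]      # each statistic's own optimum
design_var = [0.0005, 0.0010, 0.0050]   # variances under one common design

effs = [efficiency(m, d) for m, d in zip(min_var, design_var)]
loss = average_loss(min_var, design_var)
print(effs, loss)
```

Because each efficiency is a unitless ratio, the statistics can be averaged even when their variances are on different scales, which is the point made in the text.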

While minimizing cost subject to a set of variance constraints for key statistics provides an appealing solution, efficient computational algorithms for obtaining such solutions have not generally been available. Such an algorithm was recently derived, and software for its implementation was developed by Dr. James R. Chromy. This method will be used in the present study.
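The constrained criterion (minimum cost subject to a variance limit for each key statistic) can be illustrated with a small brute-force search. This is only a sketch under stated assumptions: it is not the algorithm derived by Dr. Chromy, whose details are not given in this section, and the cost function, variance function, limit, and search grid are all hypothetical.

```python
from itertools import product


def cheapest_design(cost, variance_fns, limits, grid):
    """Exhaustively search a grid of (n1, n2, n3) values and return
    (cost, design) for the cheapest design meeting every variance limit.
    Returns None when no grid point is feasible."""
    best = None
    for n1, n2, n3 in product(*grid):
        feasible = all(v(n1, n2, n3) <= lim
                       for v, lim in zip(variance_fns, limits))
        if feasible:
            c = cost(n1, n2, n3)
            if best is None or c < best[0]:
                best = (c, (n1, n2, n3))
    return best


# Hypothetical cost and variance functions (illustrative only).
def cost(n1, n2, n3):
    return 1000 * n1 + 100 * n1 * n2 + n1 * n2 * n3


def var_a(n1, n2, n3):
    return 1 / n1 + 1 / (n1 * n2) + 4 / (n1 * n2 * n3)


grid = ([10, 20, 30, 40], [1, 2], [5, 10, 20])  # PSUs, replicates, students
best = cheapest_design(cost, [var_a], [0.1], grid)
print(best)
```

A grid search is practical only for small problems; it is shown here because it makes the objective and the constraints explicit.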






To determine optimal designs, estimates of the variance and cost components associated with each stage of sampling and statistic of interest are necessary. Following a brief overview of the sample drawn for Year 11, which will be presented in Section 2, we outline the cost model developed for the purpose of the present study in Section 3. Section 4 describes the statistics which were selected for analysis, and Section 5 derives the corresponding variance and covariance component models. Finally, Section 6 describes the optimization procedure used, and Section 7 provides a summary of the results.



2. SAMPLE OVERVIEW

The National Assessment sampling design is a three-stage stratified probability sample. Stratification variables include region, community size, and socioeconomic status. An overview of the general sampling and weighting process is included here for completeness and reference.

The National Assessment sample is designed to be representative of students in three age classes, 9-, 13-, and 17-year-olds, in all schools and communities in the nation. It is also designed to produce, for a variety of subpopulations, performance estimates which are relatively unbiased and which meet certain precision requirements.

Primary Sampling Units (PSUs) are geographic land areas consisting of a single county or several counties. Each year approximately 83 PSUs are randomly selected on a probability basis so that every county and every state in the United States has a positive chance of being included in the sample.

At the second stage of sampling, a list of all schools, both public and private, within each of the selected PSUs is developed and a probability sample of these schools is selected for each of the three age classes. The number of schools selected in each PSU is determined by the approximate number of students in the eligible age group attending each school. Schools are selected in such a way that any given school will not appear in the sample more than once in a four-year period. In most years, about 1,600 schools are selected; the number selected in a particular year depends upon the number of distinct packages.

The third and final stage of sampling is the selection of a random sample of students from the eligible age group at each selected school. A total of approximately 2,600 respondents is obtained for each National Assessment package. Generally, the students are selected from one to eight schools within each selected PSU for each of the three age groups being assessed.

Selected students who do not show up for assessment are termed nonrespondents. Response rates for 9- and 13-year-olds tend to average about 85 percent, whereas the response rate for 17-year-olds averages 75 percent. Seventeen-year-olds who miss their appointments are followed up in school the day after the assessment. Seventeen-year-old dropouts and early graduates are located in their homes and administered packages. According to census data, about 10 percent of the 17-year-olds are not enrolled in school. Including these out-of-school individuals in the target population enables National Assessment to apply its results to the entire population of 17-year-olds, rather than only to those enrolled in school. The assessment of dropouts and early graduates is termed the Supplementary Frame Assessment.

Sample weights adjusted for nonresponse are computed for each age class. The weights are calculated as the reciprocal of the appropriate selection probabilities. Sample weights are used to calculate ratio estimates of the proportions of population members who respond in alternative ways to assessment exercises. So that the proportion of population members who respond in alternative ways can be calculated based on community location and occupation of parents, the assessment data are postclassified into seven size and type of community (STOC) categories.
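The weighting step can be sketched in a few lines. The example assumes the simplest case, a weight equal to a nonresponse adjustment factor divided by the selection probability, and a ratio estimate formed as a weighted proportion; the function name, the adjustment factors, and all data values are hypothetical.

```python
def ratio_estimate(responses, selection_probs, nonresponse_adj=None):
    """Weighted proportion sum(w*y) / sum(w), where each weight w is the
    reciprocal of the selection probability, optionally multiplied by a
    nonresponse adjustment factor."""
    if nonresponse_adj is None:
        nonresponse_adj = [1.0] * len(responses)
    weights = [a / p for a, p in zip(nonresponse_adj, selection_probs)]
    numerator = sum(w * y for w, y in zip(weights, responses))
    denominator = sum(weights)
    return numerator / denominator


# Hypothetical data: 1 = responded in the way of interest, 0 = did not.
y = [1, 0, 1, 1]
p = [0.01, 0.02, 0.01, 0.04]   # overall selection probabilities
est = ratio_estimate(y, p)
print(est)
```

Both the numerator and the denominator are estimated totals, which is why the result is a ratio estimate rather than a simple sample proportion.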

Each year from 75,000 to 100,000 persons are assessed in one or more learning areas normally taught in schools. In the past, NAEP has conducted major assessments in art, career and occupational development, citizenship, literature, mathematics, music, reading, science, social studies, and writing, and six of these areas have already been reassessed. Year 11 of the project (1979-80) was the third assessment in the areas of reading and literature.




3. COST MODELS

Cost models are used in design efficiency studies to show how total survey costs would be affected by changing some of the design parameters. Experience in any given survey at a fixed level of the design parameters does not usually provide any directly usable data base for estimating the parameters of a useful cost model.

For NAEP, the budget prepared for any given year was allocated to the following five categories:

1. Packages;

2. Travel points;

3. Schools;

4. Administrative sessions; and

5. Students.

In addition, a non-allocated or fixed cost component was identified.

The cost model parameter estimates are only applicable within certain limits and assumptions based on current NAEP operating practices. A few of the major assumptions are listed below:

1. The current schedule of 13-year-old assessment in the fall, 9-year-old assessment in January and February, and 17-year-old assessment in the spring is followed;

2. Any travel point in the sample for one age class assessment is also included in the other two age class assessments at the same level (same number of replications of each package administration);

3. No more than ten group package sessions may normally be assigned to a single school;

4. No more than 25 students (expected response) may be assigned to a package session; and

5. Each package will consist of no more than 45 minutes of paced tape exercises.

Any departure from these assumptions may require revision of cost model parameters.

Allocation of 1978 Budgets.

Administration. The major portion of the variable costs of an in-school assessment is associated with the field work required to gain cooperation and collect the NAEP package data. Furthermore, a large portion of this budget is associated directly or indirectly with the off-site staff of District Supervisors (DSs) and temporary Exercise Administrators (EAs).

Table 3-1 shows the assumed allocation of DS and EA time. The allocations are based on:

1. The technical proposal for the period;

2. The usual operating practice as defined in the Year 09 DS Manual; and

3. Updated District Supervisor (DS) expense reports.

Five variable cost categories (packages, travel points, schools, administrative sessions, and students) are identified in the overview. In analyzing the budget, it was noted that many cost items relate directly to the number of District Supervisors required to conduct the assessment.



Table 3-1. Percentage of Time Allocated to Different Activities by DSs and EAs

                              Activity
        Total   Travel Point   Schools   Sessions   Students
DS       100        22.8         56.6       7.7       12.9
EA       100        12.6         20.1      37.5       29.8



A special category called "DS-related" was used to accumulate all such costs. The total in this category was then allocated to variable cost categories according to the percentage distribution shown in Table 3-1.
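The allocation of the "DS-related" pool by the Table 3-1 percentages amounts to a simple proportional split, sketched below. The DS percentages are those reported in the text (22.8, 56.6, 7.7 and 12.9); the pooled total of $100,000 is a hypothetical figure used only for illustration.

```python
# DS time percentages by variable cost category, as reported in Table 3-1.
DS_SHARES = {
    "travel_points": 22.8,
    "schools": 56.6,
    "sessions": 7.7,
    "students": 12.9,
}


def allocate(total, shares):
    """Split a pooled cost across categories in proportion to the
    percentage shares (which are expected to sum to 100)."""
    return {name: total * pct / 100.0 for name, pct in shares.items()}


ds_related_total = 100_000.0   # hypothetical pooled DS-related cost
alloc = allocate(ds_related_total, DS_SHARES)
print(alloc)
```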

Costs not allocated to one of the five variable cost categories belong in the fixed cost or setup category. The general approach to allocating budgeted costs is discussed by category in the following paragraphs.

Costs placed initially in the "DS-related" category included:

(1) Fifty percent of labor costs for the associate project director for administration, the field director, and the administrative secretarial support;

(2) Eighty-five percent of the labor costs for the regional supervisors, the administrative coordinator, and the survey assistant;

(3) All DS labor costs;

(4) Duplication of cassette group tapes;

(5) Help wanted advertising costs;

(6) DS relocation expenses;

(7) Supplies including ring-binders, jiffy bags, date books, and corrugated boxes;

(8) Shipping and communication costs of DSs including mailings to and from DSs, shipment of reports, and postal cards;

(9) Central staff travel to supervise DSs, to conduct quality checks, to recruit new DSs, and to train DSs (excluding travel to the annual training session);

(10) DS travel to training sessions and to debriefing sessions; and

(11) Seventy-five percent of printing costs for DS manuals and demonstration packages.

Costs allocated to packages included:

(1) Ninety percent of the labor costs for the proofing and tape coordinator; and

(2) Production costs for magnetic tapes.

Costs allocated to travel points included:

(1) 22.8 percent of items initially allocated to DS-related expenses;

(2) DS travel to and from PSUs to conduct assessment;

(3) DS travel to and from PSUs to hold introductory meetings; and

(4) 12.6 percent of EA services.

Costs allocated to schools included:

(1) 56.6 percent of items initially allocated to DS-related expenses;

(2) Fifty percent of computer programmer labor costs;

(3) Supplies expenses including envelopes, mailing tapes, mailing labels, stationery, portfolios, and SLF storage envelopes;

(4) Computer costs including usage charges, magnetic tapes, and print ribbons;

(5) Shipping and communications costs for school mailings and for toll calls to schools;

(6) Central staff travel expenses for large city contacts;

(7) DS travel within PSUs to conduct assessment;

(8) DS travel within PSUs to conduct introductory meetings;

(9) Seventy-five percent of printing costs for introductory materials, memoranda, school official questionnaires, and school [illegible];

(10) 20.1 percent of EA services; and

(11) [illegible] expenses.



Costs allocated to administrative sessions included:

(1) 7.7 percent of DS-related expenses;

(2) Tape recorder repair expenses;

(3) Depreciation of cassette recorders (approximated by twenty percent of cost budgeted for 1979);

(4) Seventy-five percent of printing costs for administration schedules, EA manuals, and EA administrative instructions; and

(5) 37.5 percent of EA services.

Costs allocated to students included:

(1) 12.9 percent of DS-related expenses;

(2) Supplies expenses for pencils and art materials;

(3) Shipping costs for bus and freight shipment;

(4) Excess baggage charges for DS travel;

(5) Seventy-five percent of printing costs for student listing forms (SLFs) and parental permission forms; and

(6) 29.8 percent of EA services.

Based on these allocations, the 1980 variable administration costs were determined as follows (numbers are rounded):

Variable cost associated with            Amount

Package setup                         $1,200.00
Travel points                          2,816.00
Schools                                  292.00
Administrative sessions (35 min.)         13.41
Students                                   1.57

Sampling. Sampling costs constitute a much smaller part of the total budget. Variable sampling costs were associated with primary sampling units and with schools. Budgets for 1980 were examined under various design configurations actually considered prior to implementation. Results of sampling cost component estimation for the Year 07 design efficiency studies (Sherdon, Folsom, Clemmer, 1977) were also examined. Based on these considerations, the following variable costs at 1980 levels were developed:

Task                     PSU component   School component

Select school sample        $140.00           $11.64
Package assignment                             43.05
Weights                                         9.31
Total                        140.00            64.00



Print and Scoring. Only approximate variable costs for the printing and scoring components are included in the model: for package setup, $10,000 and for students, $1.23. Scoring costs can vary widely depending on the mix of hand scoring and direct optical scanning. Variable cost components are summarized in Table 3-2.

Table 3-2. Variable Cost Components Summary: 1980 Levels

Name                        Symbol        Amount

Fixed                       $C_0$     $271,000.00
Package setup               $C_p$        1,200.00
Travel points                            3,056.00
Schools                     $C_s$          356.00
Administrative sessions     $C_a$           13.41
Students (edit & score)     $C_e$            2.82



Relation of cost and variance models. In order to seek optimum design configurations, cost and variance models must be stated in terms of the same design parameters. Basic design parameters include the number of primary sampling units (PSUs), the number of replicates per PSU, and the number of students sampled per replicate. The term "replicate" is used to denote the number of group administrations planned for each group package within a PSU. Since all packages cannot be administered in each sample school, the number of schools usually exceeds the number of replicates.



The cost function has been found to be very sensitive to the number of schools. The variance function (for group packages) is directly affected only by the number of replicates. The number of schools which must be selected, on the average, in each replicate depends on a number of factors:

(1) The number and type of packages assigned to each age class;

(2) The number of students selected per package;

(3) The distribution of school sizes in terms of age eligibles; and

(4) Any administrative restrictions employed to limit the burden placed on individual schools.

A number of models relating the number of schools per replicate to package configurations and sample sizes were studied using Year 01 through 09 data. It was noted that the number of schools per replicate should be at least one even if only one package was used. The following model was selected for its fit to the data and for its intuitive appeal:

$$ s_a = \max[1,\; b_{1a} G_a] \qquad (3.1) $$

where

$s_a$ = the average number of schools per replicate for age class a, and

$G_a$ = the number of group packages assigned for age class a.

The value $b_{1a}$ was obtained for each age class by ordinary least squares fitting of the model

$$ s_a = b_{1a} G_a \qquad (3.2) $$

to the Year 01 through 09 data. The model ignores the sample size for group package sessions; this assumption would be unrealistic if the group session size could vary without limit. In practice, scheduled sessions for more than 25 students have been difficult to manage. Allowing for nonresponse and some variability in assigned sample sizes to achieve weight stability, an average achieved sample size of 16 students per group session is considered near the feasible maximum. Year 01 through 06 assessments were targeted for 12 respondents per group package session; Year 07 through 09 assessments were targeted for 16 respondents per group package session.
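The fitting step for model (3.2) and the floor imposed in model (3.1) can be sketched as follows. The closed-form slope for a least squares line through the origin is standard; the package counts and school counts used to fit it here are hypothetical, since the Year 01 through 09 data are not reproduced in this report.

```python
def fit_through_origin(g, s):
    """Ordinary least squares slope for s = b * G with no intercept:
    b = sum(G * s) / sum(G ** 2)."""
    return sum(gi * si for gi, si in zip(g, s)) / sum(gi * gi for gi in g)


def schools_per_replicate(b, g):
    """Model (3.1): at least one school per replicate."""
    return max(1.0, b * g)


# Hypothetical observations for one age class (illustrative only).
packages = [8, 10, 12, 14]       # group packages assigned in each year
schools = [3.2, 4.1, 4.7, 5.8]   # average schools per replicate observed

b_hat = fit_through_origin(packages, schools)
print(b_hat, schools_per_replicate(b_hat, 1), schools_per_replicate(b_hat, 11))
```

With a single package the fitted line would predict fewer than one school, so the maximum in (3.1) is what keeps the prediction feasible.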

Since the clustering effect for individual packages is less than that for group packages under the sample allocation schemes normally employed, the sample size per replicate for individual packages should be less than that for group packages under equivalent precision requirements.

Estimated values for $b_{1a}$ based on ordinary least squares fits were as follows (standard errors are indicated in parentheses):

Age class        $b_{1a}$

 9 (a=1)       .419 (.030)
13 (a=2)       .312 (.024)
17 (a=3)       .212 (.013)



Cost model parameters. A number of cost models suitable for studying alternative design configurations can be developed from the data in Table 3-2. If the assumptions outlined above hold, a cost model can be stated in terms of the number of PSUs, replicates, and students sampled per package as follows:

$$ C = C_0 + n_1 C_1 + n_1 n_2 C_2 + n_1 n_2 n_3 \sum_{a=1}^{3} G_a C_{3Ga} $$
$$ \phantom{C} = C_0 + n_1 C_1 + n_1 n_2 C_2 + n_1 n_2 n_3 C_3 \qquad (3.3) $$

where

$C$ = total cost;

$C_0$ = fixed cost component (may be a function of number of packages);

$C_1$ = cost associated with adding one PSU;

$C_2$ = cost associated with adding one replicate;

$C_{3Ga}$ = cost associated with adding one respondent to a group session at age class a;

$C_3 = \sum_{a=1}^{3} G_a C_{3Ga}$;

$n_1$ = number of PSUs;

$n_2$ = number of replicates per PSU;

$n_3$ = number of student respondents per replicate per group package.

The value of $C_1$ is stated in Table 3-2 directly; $C_2$ and $C_{3Ga}$ can be determined from the values in Table 3-2 and certain assumed relationships of cost model and variance model parameters.

The value of $C_2$, costs associated with adding one replicate, can be stated as a function of the school cost component, $C_s$; the estimated regression parameter, $b_{1a}$, relating numbers of schools to numbers of packages; the number of group packages, $G_a$, for each age group a; and the administration session cost, $C_a$. Symbolically, $C_2$ can be expressed as

$$ C_2 = C_a \sum_{a=1}^{3} G_a + C_s \sum_{a=1}^{3} b_{1a} G_a \qquad (3.4) $$

The student cost component, $C_{3Ga}$, for group sessions can be expressed in terms of student editing and scoring costs ($C_e$). Assuming cost structures are approximately the same for all three ages,

$$ C_{3Ga} = C_e \qquad (3.5) $$

Finally this yields

$$ C_2 = \$13.41\,(G_1 + G_2 + G_3) + \$356.00\,[.419\,G_1 + .312\,G_2 + .212\,G_3] \qquad (3.6) $$

$$ C_{3Ga} = \$2.82 \quad (a = 1, 2, 3) \qquad (3.7) $$

$$ C_3 = \$2.82\,(G_1 + G_2 + G_3) \qquad (3.8) $$
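Equations (3.3), (3.6) and (3.8) can be evaluated directly, as in the sketch below. The cost constants are the Table 3-2 values and the regression estimates quoted in the text; the design values passed to `total_cost` (50 PSUs, 2 replicates per PSU, 15 students per replicate) are used purely as an illustration, and the function names are not from the report.

```python
# Cost components (1980 levels) from Table 3-2.
C0 = 271_000.00   # fixed cost
C1 = 3_056.00     # cost per PSU (travel point)
C_S = 356.00      # cost per school
C_A = 13.41       # cost per administrative session
C_E = 2.82        # cost per student (edit and score)
B1 = (0.419, 0.312, 0.212)   # b_1a estimates for ages 9, 13, 17


def c2(packages):
    """Cost of adding one replicate, equation (3.6)."""
    sessions = C_A * sum(packages)
    schools = C_S * sum(b * g for b, g in zip(B1, packages))
    return sessions + schools


def c3(packages):
    """Cost of adding one respondent per package, equation (3.8)."""
    return C_E * sum(packages)


def total_cost(n1, n2, n3, packages):
    """Total cost, equation (3.3)."""
    return (C0 + n1 * C1 + n1 * n2 * c2(packages)
            + n1 * n2 * n3 * c3(packages))


six_per_age = (6, 6, 6)
print(round(c2(six_per_age), 2), round(c3(six_per_age), 2))
print(round(total_cost(50, 2, 15, six_per_age), 2))
```

For six packages per age group these formulas give a replicate cost of about $2,255.63 and a per-respondent cost of $50.76.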



Three cost models were entertained. The first assumed a 40-package assessment with 11, 15, and 14 group packages for the three ordered age groups, respectively. This is the assignment used in Year 11. The two other cost models assumed 3 packages per age group and 6 packages per age group, respectively. The estimated cost parameters for these three models are shown in Table 3-3.



Table 3-3. Estimated Cost Parameters

Number of Packages                  $C_1$       $C_2$      $C_3$

11 for age 9, 15 for age 13,
  14 for age 17                    3056.00     4899.29     112.80
6 per age group                    3056.00     2255.63      50.76
3 per age group                    3056.00     1188.64      25.38



4. STATISTICS ANALYZED

It was emphasized in the introduction that the present study will seek to optimize the design simultaneously for several statistics. Specifically, 58 items, intended to measure two NAEP subobjectives, were selected. From these items, 21 linear combinations were defined for analysis. All the items are pertinent to objective IV of the 1979-80 Assessment--"Application of study skills in reading." The following two subobjectives were specifically addressed:

A. "Obtains information from nonprose reading facilitators."

B. "Obtains information from materials commonly found in libraries or resource centers."

Subobjective (a) attempts to evaluate whether the students use visual aids when reading and whether they can correctly interpret information given in charts, maps and graphs. The second subobjective is directed at measuring the extent to which students use various reference materials and whether they can find specific information in these materials (e.g., dictionaries, encyclopedias, etc.). All the items selected for analysis are multiple choice in format with a single correct response.

The scores analyzed in this study fall into two categories--within-package scores and cross-package scores. Within-package scores are those defined from items taken entirely from a single package. Conversely, cross-package scores involve items taken from multiple packages.

The 15 within-package scores shown in Table 4-1 were considered. For each student taking one of the indicated packages, the score was defined to be the proportion of items involved that the student answered correctly. The statistics of ultimate interest were the means of the scores over students or, in other words, the mean proportions answered correctly.



Cross-package scores were constructed by averaging related within-package estimates. That is, if $R_1, R_2, \ldots, R_G$ are related within-package means, then their associated cross-package mean is

$$ R = [R_1 + R_2 + \cdots + R_G]/G \qquad (4.1) $$

A cross-package mean was defined for both subobjectives within each age group. These are presented in Table 4-2.

Finally, analyses were conducted for the six populations listed in Table 4-3.
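The score definitions can be made concrete with a short sketch: a student's within-package score is the proportion of the items answered correctly, the within-package mean is a weighted mean over students, and the cross-package mean averages the related within-package means. The item responses and sample weights below are hypothetical.

```python
def student_score(item_correct):
    """Proportion of the score's items that a student answered correctly
    (item_correct is a list of 0/1 indicators)."""
    return sum(item_correct) / len(item_correct)


def within_package_mean(scores, weights):
    """Weighted mean of student scores for one package."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def cross_package_mean(within_means):
    """Simple average of related within-package means."""
    return sum(within_means) / len(within_means)


# Hypothetical responses: two packages, two students each.
pkg1_scores = [student_score([1, 1, 0, 1]), student_score([1, 0, 0, 0])]
pkg2_scores = [student_score([1, 1, 1]), student_score([0, 0, 1])]
weights = [2.0, 2.0]   # hypothetical (equal) sample weights

r1 = within_package_mean(pkg1_scores, weights)
r2 = within_package_mean(pkg2_scores, weights)
r_cross = cross_package_mean([r1, r2])
print(r1, r2, r_cross)
```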



Table 4-1. Within-Package Score Definitions

Within-Package                                            Associated
Score Number      Age    Package    Item Numbers          Subobjectives*

      1            9        4       7a-7d                      b
      2            9        5       8a-8d                      b
      3            9        5       9a-9c                      a
      4            9        8       5a-5d, 9a-9d               a
      5           13        2       15a, 15b                   a
      6           13        4       7a-7d                      b
      7           13        6       7a-7c                      a
      8           13        6       9a-9e                      b
      9           13        8       9a-9d                      a
     10           17        1       7a, 7b                     a
     11           17        1       10a-10c                    a
     12           17        4       7a-7d                      b
     13           17        6       9a-9c                      b
     14           17       13       8a-8d                      b
     15           17       13       9a-9c                      a

*Subobjectives:

a. Obtains information from nonprose reading facilitators.

b. Obtains information from materials commonly found in libraries or resource centers.



Table 4-2. Cross-Package Score Definitions

         Defining Within-Package Score Numbers From Table 4-1
                          Subobjectives
Age           (a)*                (b)**

 9            3, 4                1, 2
13            5, 7, 9             6, 8
17            10, 11, 15          12, 13, 14

*Obtains information from nonprose reading facilitators.

**Obtains information from materials commonly found in libraries or resource centers.



Table 4-3. Analysis Populations

All students (National)
Non-whites
Males
Females
Students with parents' education less than high school
Students with parents' education at least high school

5. VARIANCE AND COVARIANCE MODELS

Variance models are developed to demonstrate how the precision of the estimates would be affected by changes in the design parameters. In this section two general variance models are derived which can be applied to (1) scores drawn from a single package and (2) mean scores combined across packages.

For Year 11 assume a with-replacement three-stage design of

1) PSUs

2) Schools

3) Students

and that the sample is selected with probabilities proportional to size (PPS) at the first two stages and with equal probabilities at the last stage. Let

$Y_{gijk}$ = response of student-k from school-j from PSU-i for package-g

$n_1$ = number of sample PSUs

$n_2$ = number of replicates per PSU

$n_3$ = number of sample students per school

$G$ = number of packages of interest

$N$ = number of PSUs in population

$N_i$ = number of schools in PSU-i population

$N_{ij}$ = number of age eligible students in school-ij

$A_{ij}$ = size measure for school-ij

$P_i = A_{i+}/A_{++}$ = single draw probability for PSU-i

$P_{j(i)} = A_{ij}/A_{i+}$ = single draw probability for school-ij given PSU-i selected.

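The estimate of a population total, say Y, from a single package is

$$ \hat{Y} = \frac{1}{n_1} \sum_{i=1}^{n_1} \frac{1}{P_i} \, \frac{1}{n_2} \sum_{j=1}^{n_2} \frac{1}{P_{j(i)}} \, \frac{N_{ij}}{n_3} \sum_{k=1}^{n_3} Y_{gijk} \qquad (5.1) $$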
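A sketch of a total estimator for this kind of design is given below. It assumes the standard with-replacement PPS form: simple expansion by $N_{ij}/n_3$ within schools, and division by the single-draw probabilities averaged over draws at the school and PSU stages. That form is an assumption consistent with the definitions above rather than a formula quoted from the report, and the data structure and values are hypothetical.

```python
def estimate_total(psus):
    """Estimate a population total from a three-stage sample.

    psus: list of dicts, one per sampled PSU, each with
      'p'       -- single-draw probability of the PSU
      'schools' -- list of dicts with
          'p'          -- single-draw probability of the school given the PSU
          'n_eligible' -- number of age-eligible students in the school
          'responses'  -- list of sampled student responses
    """
    n1 = len(psus)
    total = 0.0
    for psu in psus:
        n2 = len(psu["schools"])
        psu_total = 0.0
        for school in psu["schools"]:
            n3 = len(school["responses"])
            # Expand the student sample to the school's eligible students.
            school_total = school["n_eligible"] * sum(school["responses"]) / n3
            psu_total += school_total / school["p"]
        total += (psu_total / n2) / psu["p"]
    return total / n1


# Hypothetical sample: one PSU containing one sampled school.
sample = [{
    "p": 0.5,
    "schools": [{"p": 0.25, "n_eligible": 100, "responses": [1, 0, 1, 1]}],
}]
y_hat = estimate_total(sample)
print(y_hat)
```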



Now, consider a separate ratio estimator of a cross-package mean score for items taken from G distinct packages. That is,

$$ R = [R_1 + R_2 + \cdots + R_G]/G \qquad (5.2) $$

where

$R_g = y_g / x_g$ = the within-package mean score for package-g,

$y_g$ = the estimated total for the score from the package-g sample (g = 1, 2, ..., G),

and

$x_g$ = the total number of students estimated from the package-g sample (g = 1, 2, ..., G).

The variance of R is

$$ V(R) = \Big[ \sum_{g=1}^{G} V(R_g) + 2 \sum_{g=1}^{G-1} \sum_{g'=g+1}^{G} \mathrm{Cov}(R_g, R_{g'}) \Big] / G^2 \qquad (5.3) $$

It is commonly known, and has been used in previous NAEP efficiency
studies, that $V(R_g)$ can be decomposed into

$$V(R_g) = \sigma_g^2(1)/n_1 + \sigma_g^2(2)/n_1 n_2 + \sigma_g^2(3)/n_1 n_2 n_3 \tag{5.4}$$

where

$\sigma_g^2(1)$ = the between PSU contribution to variance,

$\sigma_g^2(2)$ = the between school within PSU contribution to variance,

$\sigma_g^2(3)$ = the between student within school contribution to variance.

Software is available at RTI to estimate these three components (Shah, 1979).
Model (5.4) is exactly that required for scores drawn from a single package.

Now, the Taylor Series approximation to $\mathrm{Cov}(R_g, R_{g'})$ is

$$\begin{aligned}
\mathrm{Cov}(R_g, R_{g'}) &= \mathrm{Cov}(y_g/x_g,\; y_{g'}/x_{g'}) \\
&\approx \frac{1}{X_g X_{g'}} \big[ \mathrm{Cov}(y_g, y_{g'}) - R_g \mathrm{Cov}(y_{g'}, x_g) - R_{g'} \mathrm{Cov}(y_g, x_{g'}) + R_g R_{g'} \mathrm{Cov}(x_g, x_{g'}) \big] \\
&= \mathrm{Cov}(z_g, z_{g'})
\end{aligned} \tag{5.5}$$

where $X_g$ and $R_g$ inside the brackets are the population values
corresponding to $x_g$ and to the estimated ratio, respectively. The form
of the covariance term is explicitly derived in Appendix A. It is shown that

$$\mathrm{Cov}(R_g, R_{g'}) = \mathrm{Cov}(z_g, z_{g'}) = \sigma_{gg'}(1)/n_1 + t\,\sigma_{gg'}(2)/n_1 n_2 \tag{5.6}$$

where $\sigma_{gg'}(1)$ and $\sigma_{gg'}(2)$ are components of covariance for the
first and second stages, respectively, analogous to the variance components
in (5.4). Also, t is the proportion of schools where both package-g and
package-g' are administered.

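Next, combining (5.3), (5.4) and (5.6),

$$\begin{aligned}
V(R) &= G^{-2} \Big\{ \sum_{g=1}^{G} \big[ \sigma_g^2(1)/n_1 + \sigma_g^2(2)/n_1 n_2 + \sigma_g^2(3)/n_1 n_2 n_3 \big] + 2 \sum_{g=1}^{G-1} \sum_{g'=g+1}^{G} \big[ \sigma_{gg'}(1)/n_1 + t\,\sigma_{gg'}(2)/n_1 n_2 \big] \Big\} \\
&= G^{-2} \big\{ \sigma^2(1)/n_1 + \sigma^2(2)[1 + 2t\rho]/n_1 n_2 + \sigma^2(3)/n_1 n_2 n_3 \big\}
\end{aligned} \tag{5.7}$$

where

$$\sigma^2(1) = \sum_{g=1}^{G} \sigma_g^2(1) + 2 \sum_{g=1}^{G-1} \sum_{g'=g+1}^{G} \sigma_{gg'}(1) \tag{5.8}$$

$$\sigma^2(2) = \sum_{g=1}^{G} \sigma_g^2(2) \tag{5.9}$$

$$\rho = \Big[ \sum_{g=1}^{G-1} \sum_{g'=g+1}^{G} \sigma_{gg'}(2) \Big] / \sigma^2(2) \tag{5.10}$$

$$\sigma^2(3) = \sum_{g=1}^{G} \sigma_g^2(3) \tag{5.11}$$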



The above model for the variance of R, equation (5.7), contains a
parameter (t) for the proportion of schools where a given pair of
packages were jointly administered. This is a function of the number of
schools per replicate, since every package is administered within a
replicate. Chromy, Clemmer, and Jones (1980) have shown that the number
of schools per replicate can be approximated by a function of the total
number of packages. This implies that t probably does not vary with the
sample design parameters (i.e., $n_1$, $n_2$, and $n_3$). Thus the value of t
observed in the data will be substituted into the model.
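As an illustration of how models (5.4) and (5.7) fit together, the following minimal sketch evaluates the cross-package variance two ways: directly from (5.3), (5.4) and (5.6), and through the pooled components (5.8) to (5.11). The function names and all component values are hypothetical and chosen only for the example; they are not estimates from the Year 11 data.

```python
def single_package_variance(s1, s2, s3, n1, n2, n3):
    """Model (5.4): variance of a within-package mean score."""
    return s1 / n1 + s2 / (n1 * n2) + s3 / (n1 * n2 * n3)


def cross_package_variance_direct(var_comps, cov_comps, t, n1, n2, n3):
    """Equation (5.3), with (5.4) for variances and (5.6) for covariances."""
    G = len(var_comps)
    variances = sum(single_package_variance(*c, n1, n2, n3) for c in var_comps)
    covariances = sum(c1 / n1 + t * c2 / (n1 * n2) for c1, c2 in cov_comps.values())
    return (variances + 2 * covariances) / G ** 2


def cross_package_variance_pooled(var_comps, cov_comps, t, n1, n2, n3):
    """Equation (5.7), using the pooled components (5.8) to (5.11)."""
    G = len(var_comps)
    sigma1 = sum(c[0] for c in var_comps) + 2 * sum(c[0] for c in cov_comps.values())
    sigma2 = sum(c[1] for c in var_comps)
    sigma3 = sum(c[2] for c in var_comps)
    rho = sum(c[1] for c in cov_comps.values()) / sigma2
    total = (sigma1 / n1
             + sigma2 * (1 + 2 * t * rho) / (n1 * n2)
             + sigma3 / (n1 * n2 * n3))
    return total / G ** 2


# Hypothetical components for G = 3 packages: (stage 1, stage 2, stage 3).
var_comps = [(4.0, 9.0, 250.0), (5.0, 8.0, 240.0), (3.0, 10.0, 260.0)]
# Hypothetical covariance components for each pair g < g': (stage 1, stage 2).
cov_comps = {(0, 1): (2.0, 3.0), (0, 2): (1.5, 2.5), (1, 2): (1.0, 3.5)}

direct = cross_package_variance_direct(var_comps, cov_comps, 0.4, 10, 2, 15)
pooled = cross_package_variance_pooled(var_comps, cov_comps, 0.4, 10, 2, 15)
```

Both routes should agree up to floating-point rounding, which is a quick check that the pooled form (5.7) is only a regrouping of terms.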

The model outlined above was developed at the stratum level.
However, in determining the optimal allocation, the population will be
stratified into eight strata defined by the cross-classification of
geographical region (West, Central, Northeast, and Southeast) and
community size (rural = no place with 25,000 or more population in the
1970 Census, urban = otherwise). Variance and covariance components will
be estimated for each of the strata.

Finally, variance and covariance components will be estimated for
each of six subpopulations (domains) judged to be of interest: National,
Non-white, Males, Females, students with parents' education less than
high school, and students with parents' education at least at the high
school level.






6. OPTIMIZATION PROBLEM 



Thus far, a linear cost model has been developed of the general form

$$C = \sum_{h=1}^{H} C(h)\, x(h) \tag{6.1}$$

where C(h) is the cost of adding an additional unit to the h-th stage of
the sample and x(h) is the sample size for stage-h. In addition, a
variance model has also been developed of the form

$$V(k) = \sum_{h=1}^{H} V(kh)/x(h) \tag{6.2}$$

where V(kh) is the component of variance associated with the h-th stage
of sampling for statistic-k and x(h) is as before.

Combining these two models, the problem at hand is to find the
values of x(h) (h=1,2,...,H) which minimize C subject to:

(a) $V(k) \le V^*(k)$ (k=1,2,...,K)

and

(b) $x(h) \ge 0$ (h=1,2,...,H)

where $V^*(k)$ is a positive constraint on the variance of statistic-k.

Several approximate solution methods for this problem are described
by Cochran (1977). In addition, numerical solution methods have been
given by Hartley and Hocking (1963), Chatterjee (1966), Zukhovitsky and
Avdeyeva (1966), and Huddleston et al. (1970). The solution method used
in this report is a numerical solution developed by Chromy (1970). The
algorithm is written in BASIC and operates interactively on the HP2000
computer.
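For intuition only, the special case of a single variance constraint (K = 1) has a closed-form solution by the Lagrange multiplier method, of the kind treated in Cochran (1977). The sketch below implements that special case; it is not the multiple-constraint algorithm used in this report, and the function name, costs, and variance components are hypothetical.

```python
import math


def optimal_allocation(costs, variances, v_star):
    """Minimize sum(C(h) * x(h)) subject to sum(V(h) / x(h)) <= v_star.

    Setting the derivative of the Lagrangian to zero gives
    x(h) proportional to sqrt(V(h) / C(h)); the constraint, which is
    binding at the optimum, fixes the constant of proportionality.
    """
    root_sum = sum(math.sqrt(v * c) for v, c in zip(variances, costs))
    return [math.sqrt(v / c) * root_sum / v_star for v, c in zip(variances, costs)]


def total_cost(costs, sizes):
    return sum(c * x for c, x in zip(costs, sizes))


def total_variance(variances, sizes):
    return sum(v / x for v, x in zip(variances, sizes))


# Hypothetical three-stage example: unit costs and variance components.
costs = [400.0, 50.0, 2.0]
variances = [4.0, 9.0, 25.0]
v_star = 0.5

sizes = optimal_allocation(costs, variances, v_star)
cost = total_cost(costs, sizes)
achieved = total_variance(variances, sizes)
```

At the optimum the variance constraint holds with equality and the minimum cost equals the square of the sum of sqrt(V(h) C(h)) divided by the variance limit. With K > 1 constraints, several may bind at once and a numerical method is needed.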

7. NUMERICAL RESULTS

Estimates of the variance model parameters were obtained using data
from the NAEP Year 11 Assessment. For package level statistics, separate
components were estimated for each of the eight strata. This led to a
stratified variance model which allows for a separate allocation of
resources to each stratum. The data could not support the estimation of
stratified models for the cross-package means. For these statistics, a
single set of national level components was estimated.

The remainder of this section will be broken into two parts, one
for within-package statistics and the other for cross-package statistics.

Within-Package Analyses

Variance models were estimated for each of the six domains in Table
4-3 for the 15 scores in Table 4-1. This yielded 90 estimated models
each with 24 components (3 levels by 8 strata).

The variance components in these models are estimated from the NAEP
sample data and, hence, are subject to sampling variation. In an attempt
to smooth out this variability, groups of components which were expected
to be of comparable size were identified and smoothed estimates obtained.
The components were grouped by age, stratum and domain. Within an
age-stratum-domain group let $\sigma_i^2(j)$ be the variance component for the
i-th score for the j-th level of the design (j = 1,2,3 for PSUs,
replicates and students, respectively) and let n be the number of scores
in the group. Define

$$\delta_i(j) = \sigma_i^2(j)/\sigma_i^2(+) \tag{7.1}$$

where

$$\sigma_i^2(+) = \sum_{j} \sigma_i^2(j). \tag{7.2}$$

Also define

$$\bar{\delta}(j) = \sum_{i} \delta_i(j)/n \tag{7.3}$$

and

$$\bar{\sigma}^2(+) = \sum_{i} \sigma_i^2(+)/n \tag{7.4}$$

Finally, the smoothed components were calculated as

$$\tilde{\sigma}^2(j) = \bar{\delta}(j)\, \bar{\sigma}^2(+) \tag{7.5}$$

Within an age-stratum-domain group, this process estimates an average
total variation ($\bar{\sigma}^2(+)$) and an average proportion of variation
attributable to the j-th stage of sampling ($\bar{\delta}(j)$). The product of
these two estimates gives the stage-j variance component for the group.
This approach was taken, rather than directly averaging the components,
since the total variation and the proportions of variation were expected
to exhibit greater stability.
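The smoothing steps (7.1) through (7.5) can be sketched as follows. This is a minimal illustration, assuming the components for one age-stratum-domain group are held as a list with one [stage 1, stage 2, stage 3] entry per score; the function name and the input values are hypothetical.

```python
def smooth_components(components):
    """Smooth variance components within one age-stratum-domain group.

    components: one list [s(1), s(2), s(3)] per score in the group.
    Returns the smoothed component for each stage, as in (7.5).
    """
    n = len(components)
    stages = len(components[0])
    # (7.2): total variation for each score.
    totals = [sum(c) for c in components]
    # (7.1): proportion of each score's variation at each stage.
    props = [[c[j] / tot for j in range(stages)] for c, tot in zip(components, totals)]
    # (7.3): average proportion at each stage.
    mean_prop = [sum(p[j] for p in props) / n for j in range(stages)]
    # (7.4): average total variation.
    mean_total = sum(totals) / n
    # (7.5): smoothed component = average proportion x average total.
    return [mp * mean_total for mp in mean_prop]


# Hypothetical group of two scores.
group = [[2.0, 3.0, 5.0], [1.0, 1.0, 2.0]]
smoothed = smooth_components(group)
```

Because the average proportions sum to one, the smoothed components always sum to the average total variation, which is one way to check an implementation.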

The smoothing process resulted in a separate variance model for
each domain within each age group, for a total of 18 estimated variance
models each with 24 levels (8 strata by 3 sampling stages).

Once all the models have been parameterized, it becomes necessary
to select appropriate variance constraints. In this situation it is
often convenient to work in terms of relative variances and standard
errors. To see this, recall that a 95 percent normal theory confidence
interval (C.I.) for a mean $\bar{Y}$ is approximately

$$\text{C.I.} = \bar{Y} \pm 2\,SE(\bar{Y}) \tag{7.6}$$

where $SE(\bar{Y})$ is the standard error of the mean. A common precision
constraint is to require that the half width of the confidence interval
be less than some multiple of the mean, say $\alpha\bar{Y}$. Thus, it is required
that

$$\text{C.I.} = \bar{Y} \pm \alpha\bar{Y}. \tag{7.7}$$

This implies that

$$\alpha\bar{Y} = 2\,SE(\bar{Y}) \tag{7.8}$$

or

$$\alpha = 2\,SE(\bar{Y})/\bar{Y} = 2\,RSE(\bar{Y}) \tag{7.9}$$

where $RSE(\bar{Y})$ is the relative standard error of the mean. Hence, the
expected half-width of a 95 percent confidence interval relative to the
mean is twice the relative standard error of the mean.
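A small worked example of relation (7.9) may help. The sketch below uses a hypothetical mean and standard error; only the factor of 2 comes from the text.

```python
def relative_half_width(mean, standard_error):
    """Half-width of the approximate 95 percent interval relative to the mean (7.9)."""
    return 2 * standard_error / mean


# Hypothetical statistic: mean score of 50 with a standard error of 1.25,
# i.e. a relative standard error of 2.5 percent.
alpha = relative_half_width(50.0, 1.25)
interval = (50.0 * (1 - alpha), 50.0 * (1 + alpha))
```

Under these assumed values alpha is 0.05, so a 2.5 percent RSE corresponds to an interval of roughly plus or minus 5 percent of the mean.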

To take advantage of the relationship between the relative standard
error (RSE) and the expected width of a confidence interval in setting
variance constraints, the general variance model shown in (6.2) can be
recast by dividing through by the squared mean to yield

$$V(k)/\bar{Y}^2(k) = \sum_{h=1}^{H} V(kh)/\bar{Y}^2(k)\,x(h) \tag{7.10}$$

or

$$RV(k) = \sum_{h=1}^{H} RV(kh)/x(h) \qquad (k=1,2,\ldots,K) \tag{7.11}$$

where

$$RV(k) = V(k)/\bar{Y}^2(k) \tag{7.12}$$

      = the relative variance of statistic-k,

and

$$RV(kh) = V(kh)/\bar{Y}^2(k) \tag{7.13}$$

       = the relative variance component for stratum-h
         statistic-k.

Thus, the constraints take the form

$$RV(k) \le [RSE^*(k)]^2 \tag{7.14}$$

or

$$\sum_{h=1}^{H} RV(kh)/x(h) \le [RSE^*(k)]^2 \qquad (k=1,2,\ldots,K) \tag{7.15}$$

where $RSE^*(k)$ is the relative standard error constraint on statistic-k.
This transformation was applied to all the variance models in this
study.
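The recasting in (7.10) through (7.15) amounts to a rescaling followed by a comparison, which can be sketched as below. The function names, variance components, mean, and sample sizes are hypothetical; the sizes are written as cumulative counts (PSUs, replicates, students) so that each component is divided by its own x(h) as in (6.2).

```python
import math


def relative_components(components, mean):
    """(7.13): divide each variance component by the squared mean."""
    return [v / mean ** 2 for v in components]


def projected_rse(rel_components, sizes):
    """Square root of the relative variance in (7.11)."""
    return math.sqrt(sum(rv / x for rv, x in zip(rel_components, sizes)))


def meets_constraint(rel_components, sizes, rse_limit):
    """(7.15): relative variance no larger than the squared RSE limit."""
    return sum(rv / x for rv, x in zip(rel_components, sizes)) <= rse_limit ** 2


# Hypothetical statistic: mean 100 with stage components 4, 9 and 25.
rel = relative_components([4.0, 9.0, 25.0], 100.0)
# 10 PSUs, 2 replicates per PSU, 15 students per replicate.
sizes = [10, 10 * 2, 10 * 2 * 15]
rse = projected_rse(rel, sizes)
```

For these assumed inputs the projected RSE is a little under one percent, so a five percent constraint is met and a half percent constraint is not.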

Optimal NAEP sample designs were obtained assuming the cost model
shown in Table 3-3 with package assignments of 11, 15, and 14 to the
three ordered age groups (the Year 11 cost model) and assuming the 18
smoothed within-package variance models. Table 7-1 presents the optimal
design under global 10 percent relative standard error constraints.
This table can be contrasted with Table 7-2, which presents the optimal
design for five percent RSE constraints. The main body of each table
presents, for each stratum, the number of PSU's, replicates (Rep's) per
PSU and students per replicate. Note the substantial difference in the
resources required for these two designs induced by the change in the
constraints.
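The size of that difference follows from the form of (7.15): halving every RSE limit divides each relative variance limit by four, and multiplying every x(h) by four restores feasibility, so a purely variable linear cost is expected to grow by about the same factor. The sketch below checks this against the totals reported in Tables 7-1 and 7-2; the dictionary layout and variable names are illustrative only.

```python
# Reported totals from Table 7-1 (10% RSE) and Table 7-2 (5% RSE).
design_10 = {"students_per_package": 1343, "variable_cost": 363987}
design_05 = {"students_per_package": 5371, "variable_cost": 1455950}

# Tightening the RSE limit from 10% to 5% scales the variance limit by 1/4,
# so sample sizes (and variable cost) are expected to scale by 4.
expected_factor = (0.10 / 0.05) ** 2

ratios = {key: design_05[key] / design_10[key] for key in design_10}
```

Both reported ratios come out within a fraction of a percent of four, consistent with the replicates per PSU and students per replicate staying essentially unchanged between the two tables while the number of PSU's grows.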

In an attempt to identify reasonable precision constraints, the
percent RSE's for within-package means projected by the variance models
are presented in Table 7-3 for a hypothetical sample of 10 PSU's per
stratum, two replicates per PSU and 15 students per replicate. This
implies a total sample size of 2,400 students per package. Table 7-3
presents the best precision possible under the above hypothetical design.
Notice that all of the projected RSE's are less than the 10 percent
constraints used to prepare Table 7-1, while several are greater than

Table 7-1. NAEP Design Optimization for Within-Package Means

                    Urban                          Rural
Region    PSU's    Rep's   Students      PSU's    Rep's   Students

NE         2.54     0.61     127.54       0.78     1.37      52.03
S          1.45     2.23      53.78       3.15     1.85      38.46
NC         4.67     2.18      22.48       2.05     0.66      55.57
W          3.91     0.86      82.58       2.21     1.74      28.69

   20.76     Total PSU's
   30.41     Total Rep's
   1,343     Total Students/Package
$363,987     Variable Cost

Notes:
   Year 11 cost model
   10% RSE constraints
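The totals in Table 7-1 can be reproduced from the stratum rows under one reading of the columns, which is an assumption here: "Rep's" is replicates per PSU and "Students" is students per replicate, so total replicates is the sum of PSU's times Rep's and total students per package is the sum of PSU's times Rep's times Students. A minimal sketch:

```python
# Stratum rows of Table 7-1: (PSU's, Rep's per PSU, Students per replicate).
table_7_1 = {
    ("NE", "urban"): (2.54, 0.61, 127.54),
    ("S", "urban"): (1.45, 2.23, 53.78),
    ("NC", "urban"): (4.67, 2.18, 22.48),
    ("W", "urban"): (3.91, 0.86, 82.58),
    ("NE", "rural"): (0.78, 1.37, 52.03),
    ("S", "rural"): (3.15, 1.85, 38.46),
    ("NC", "rural"): (2.05, 0.66, 55.57),
    ("W", "rural"): (2.21, 1.74, 28.69),
}

total_psus = sum(p for p, _, _ in table_7_1.values())
total_reps = sum(p * r for p, r, _ in table_7_1.values())
total_students = sum(p * r * s for p, r, s in table_7_1.values())
```

The recomputed values agree with the printed totals (20.76, 30.41 and 1,343) to within the rounding of the two-decimal table entries, which supports this reading of the column headings.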



Table 7-2. NAEP Design Optimization for Within-Package Means

                    Urban                          Rural
Region    PSU's    Rep's   Students      PSU's    Rep's   Students

NE        10.17     0.61     127.16       3.11     1.38      51.91
S          5.81     2.23      53.82      12.61     1.85      38.44
NC        18.70     2.18      22.47       8.20     0.66      55.46
W         15.63     0.86      82.52       8.85     1.73      28.71

     83.08     Total PSU's
    121.68     Total Rep's
     5,371     Total Students/Package
$1,455,950     Variable Cost

Notes:
   Year 11 cost model
   5% RSE constraints



the Table 7-2 five percent RSE constraints. Review of Table 7-3 led to
the formation of the specially selected set of RSE constraints exhibited
in Table 7-4.

A third optimization for within-package statistics is reported on
in Table 7-5. This design was derived from the Year 11 cost model and
the selected precision constraints in Table 7-4. Notice that this
design is very similar in the total numbers of PSU's, replicates, and
students to the hypothetical sample design used to generate the
precision constraints in Table 7-4. On the other hand, the optimal
design differs markedly in its allocation of resources to the various
strata and stages from that of the hypothetical design. The design
indicated in Table 7-5 demonstrates how the available resources should
be allocated to minimize the cost of the survey while still meeting the
designated requirements.

Two additional optimizations were calculated and are presented in
Tables 7-6 and 7-7. Both of these optimizations were constrained as
shown in Table 7-4. The difference between them is that the former
assumed the six packages per age group cost model and the latter the
three packages per age group cost model (see Table 3-3). Comparison of
Tables 7-5, 7-6, and 7-7 indicates that reducing the number of packages
reduces the cost of the optimal sample design. In addition, most of the
cost saving comes about through a reduction in the number of PSU's, with
the sample sizes of the other two stages increasing. The reason for
this becomes readily apparent when the three cost models in Table 3-3
are compared. Notice that reducing the number of packages per age group
leaves the cost per PSU unchanged while substantially reducing the cost
per replicate and per student. Hence, reducing the number of packages

Table 7-3. Projected Percent RSE's for Within-Package
           Means Assuming 10 PSU's/Stratum, 2 Rep's/PSU
           and 15 Students/Rep.

Domain                   Age 9     Age 13     Age 17

National                  2.13       2.42       1.36
Non-White                 5.42       5.04       3.67
Male                      2.99       2.96       1.75
Female                    2.41       2.77       1.58
Parents Education
  < High School           7.03       6.10       3.33
Parents Education
  ≥ High School           2.27       2.20       1.28


Table 7-4. Percent RSE Constraints for Within-Package Means

Domain                   Age 9     Age 13     Age 17

National                  2.00       2.50       2.00
Non-White                 5.50       5.00       4.00
Male                      3.00       3.00       2.00
Female                    2.50       2.50       2.00
Parents Education
  < High School           7.00       6.00       3.50
Parents Education
  ≥ High School           2.25       2.00       2.00

Table 7-5. NAEP Design Optimization for Within-Package Means

                    Urban                          Rural
Region    PSU's    Rep's   Students      PSU's    Rep's   Students

NE        17.70     1.30      23.72       3.19     0.64      45.53
S         10.87     1.36      26.07       7.77     1.35      28.43
NC         9.92     2.56      21.75       2.84     3.89      18.72
W         19.37     1.29      25.75       2.11     5.23      23.42

     73.77     Total PSU's
    122.34     Total Rep's
     2,977     Total Students/Package
$1,160,640     Variable Cost

Notes:
   Year 11 cost model
   Table 7-4 constraint set



Table 7-6. NAEP Design Optimization for Within-Package Means

                    Urban                          Rural
Region    PSU's    Rep's   Students      PSU's    Rep's   Students

NE        14.51     1.64      25.50       2.70     0.99      46.45
S          7.70     2.05      27.53       6.04     1.57      31.31
NC         7.51     3.62      23.09       2.19     6.00      17.70
W         16.97     1.40      29.05       1.21    10.50      23.45

   58.83     Total PSU's
  128.59     Total Rep's
   3,312     Total Students/Package
$637,925     Variable Cost

Notes:
   Cost model assuming 6 packages/age group
   Table 7-4 constraint set



Table 7-7. NAEP Design Optimization for Within-Package Means

                    Urban                          Rural
Region    PSU's    Rep's   Students      PSU's    Rep's   Students

NE        11.94     2.07      27.85       2.05     1.59      42.63
S          5.89     2.86      29.19       5.17     2.11      31.02
NC         5.74     5.15      24.10       1.78     8.33      17.67
W         14.59     1.69      31.68       1.11    12.53      24.70

   48.27     Total PSU's
  138.70     Total Rep's
   3,758     Total Students/Package
$407,736     Variable Cost

Notes:
   Cost model assuming 3 packages/age group
   Table 7-4 constraint set

makes it less expensive to select more replicates, more students, and
fewer PSU's to meet the same variance constraints.

Cross-Package Analyses

As was noted previously, it was not possible to estimate all of the
parameters in the fully stratified variance model for cross-package
means. For this reason, a non-stratified national three-stage variance
model will be used here. The three stages correspond to PSU's,
replicates within PSU's and students within replicates. As was done for
the within-package analyses, the variance models were placed on a
relative scale by dividing through by the squared mean.

All of the design optimizations for cross-package statistics are
presented in Table 7-8. The sequence of optimizations proceeds similarly
to the within-package analyses. First, two design optimizations were
performed assuming the Year 11 cost model (see Table 3-3) and either
global ten percent relative standard error (RSE) constraints or five
percent RSE constraints. These two constraint sets led to designs that
differed markedly in cost and number of PSU's. However, the number of
replicates per PSU and students per replicate were left unchanged.

The shape of the constraint space was explored by considering the
projected RSE's in Table 7-9 for a hypothetical sample design consisting
of 50 PSU's, 2 replicates per PSU, and 15 students per replicate. This
table indicates that the ten percent RSE constraints were generally too
loose, while the five percent RSE constraints were often too tight.
This led to the constraint set presented in Table 7-10.

The remaining three optimizations presented in Table 7-8 were
calculated using the constraints in Table 7-10. These three designs
were derived from the three cost models in Table 3-3. Surprisingly,

Table 7-8. NAEP Design Optimizations for Cross-Package Means

                                                               Total
Cost Model &                  Rep's/   Students/    Total    Students/    Variable
Constraints          PSU's     PSU       Rep        Rep's     Package       Cost

Year 11 cost model
  10% RSE's          26.67     4.03      9.84      107.47      1,058     $727,364
  5% RSE's          106.69     4.03      9.84      429.88      4,231    2,909,460
  Table 7-10
  Constraints        48.39     2.37     16.02      114.90      1,841      918,514

Six packages/age group cost model
  Table 7-10
  Constraints        48.19     2.37     16.47      114.30      1,883      500,682

Three packages/age group cost model
  Table 7-10
  Constraints        47.50     2.37     18.19      112.46      2,045      330,759



Table 7-9. Projected Percent RSE's for Cross-Package Means
           Assuming 50 PSU's, 2 Rep's/PSU, and 15 Students/Rep.

                          Age 9            Age 13            Age 17
                       Subobjective     Subobjective      Subobjective
Domain                   a       b        a       b         a       b

National                3.14    2.81     2.45    1.36      1.24    1.0?
Non-White              10.07    7.58     6.?3    3.05      2.88    2.11
Male                    1.81    7.06     2.92    2.51      1.68    1.39
Female                  4.30    2.32     2.38    1.29      1.32    1.14
Parents Education
  < High School         7.50    4.49     4.35    3.31      2.63    2.27
Parents Education
  ≥ High School         3.73    2.41     1.96    1.22      1.08    0.98



Table 7-10. Percent RSE Constraints for Cross-Package Means

                          Age 9            Age 13            Age 17
                       Subobjective     Subobjective      Subobjective
Domain                   a       b        a       b         a       b

National                3.00    3.00     2.50    1.50      1.25    1.00
Non-white              10.00    7.50     6.50    3.00      3.00    2.25
Male                    2.00    7.00     3.00    2.50      1.75    1.50
Female                  4.50    2.50     2.50    1.50      1.50    1.25
Parents Educ.
  < High School         7.50    4.50     4.50    3.50      2.75    2.25
Parents Educ.
  ≥ High School         3.75    2.50     2.00    1.25      1.00    1.00



these three optimal designs are virtually the same despite the fact that
the three cost models differ substantially.

8. CONCLUSIONS 






In general this study produced results consistent with earlier studies
and tends to confirm the current NAEP design.

Numerical solutions for within-package means tended to produce student
session sizes which were much larger than the range covered by the linear
cost model. It may also be noted that reducing the number of packages has
the effect of reducing the optimal number of PSU's. This may be difficult
to implement while still retaining the ability to produce reliable data for
geographical and type of community domains.

The analysis for cross-package means was based on a fairly small
number of items and bears repeating on a larger scale. Analytical solutions
for group session sizes conform more closely to those of the present design.



REFERENCES 



Chatterjee, S. (1966). A programming algorithm and its statistical applica-
tions. O.N.R. Tech. Rept. 1, Department of Statistics, Harvard
University, Cambridge.

Chromy, J. R., "Cost Minimization With Multiple Constraints." RTI internal
documentation, 1978.

Cochran, W. G., Sampling Techniques, third edition. John Wiley and Sons,
1977.

Hartley, H. O., and Hocking, R. R. (1963). Convex programming by
tangential approximation. Management Science, 9, 600-612.

Huddleston, H. F., Claypool, P. L., and Hocking, R. R. (1970). Optimum
sample allocation to strata using convex programming. App. Stat. 19,
273-278.

Shah, B. V., R. E. Folsom, C. A. Clayton, Efficiency Study of Year-03
In-School Design. RTI final report prepared for National Assessment
of Educational Progress, November 1973.

Sherdon, A. W., R. E. Folsom, A. F. Clemmer, A Study of Optimum Designs for
an In-School Assessment. RTI final report prepared for National
Assessment of Educational Progress, December 1977.

Zukhovitsky, S. I., and Avdeyeva, L. I. (1966). Linear and Convex Programm-
ing. W. B. Saunders, Philadelphia.






APPENDIX A 



COVARIANCE COMPONENTS FOR TWO ESTIMATED TOTALS 




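For the derivation of the covariance between two package total
estimates, consider two totals, say $\hat{u}$ and $\hat{v}$, from two distinct
packages. Explicitly,

$$\hat{u} = \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \sum_{j=1}^{n_2} [n_2 P_{j(i)}]^{-1} \sum_{k=1}^{n_3} u_{ijk}\, N_{ij}/n_3 \tag{A.1}$$

likewise for $\hat{v}$. Recall that

$$\mathrm{Cov}(\hat{u}, \hat{v}) = E_1[\mathrm{Cov}(\hat{u}, \hat{v} \mid 1)] + \mathrm{Cov}_1[E(\hat{u} \mid 1), E(\hat{v} \mid 1)] \tag{A.2}$$

and

$$\mathrm{Cov}(\hat{u}, \hat{v} \mid 1) = E_2[\mathrm{Cov}(\hat{u}, \hat{v} \mid 1,2) \mid 1] + \mathrm{Cov}_2[E(\hat{u} \mid 1,2), E(\hat{v} \mid 1,2) \mid 1] \tag{A.3}$$

where the numerals 1 and 2 indicate which stage of sampling the expectation
is being taken over or conditioned on. Note that

$$\mathrm{Cov}(\hat{u}, \hat{v} \mid 1,2) = 0 \tag{A.4}$$

since non-overlapping simple random student samples are selected within
schools for each package. Thus,

$$E_2[\mathrm{Cov}(\hat{u}, \hat{v} \mid 1,2) \mid 1] = 0 \tag{A.5}$$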



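Now consider

$$\begin{aligned}
E(\hat{u} \mid 1,2) &= \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \sum_{j=1}^{n_2} [n_2 P_{j(i)}]^{-1} E\Big( \sum_{k=1}^{n_3} u_{ijk}\, N_{ij}/n_3 \,\Big|\, 1,2 \Big) \\
&= \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \sum_{j=1}^{n_2} [n_2 P_{j(i)}]^{-1}\, U_{ij+}
\end{aligned} \tag{A.6}$$

where

$$U_{ij+} = \sum_{k=1}^{N_{ij}} U_{ijk},$$

the population total for school-ij.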

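Likewise,

$$E(\hat{v} \mid 1,2) = \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \sum_{j=1}^{n_2} [n_2 P_{j(i)}]^{-1}\, V_{ij+} \tag{A.7}$$

Note that

$$E(\hat{u} \mid 1,2) = \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \Big\{ \sum_{j=1}^{t n_2} [n_2 P_{j(i)}]^{-1}\, U_{ij+} + \sum_{j=t n_2+1}^{n_2} [n_2 P_{j(i)}]^{-1}\, U_{ij+} \Big\} \tag{A.8}$$

and

$$E(\hat{v} \mid 1,2) = \sum_{i=1}^{n_1} [n_1 P_i]^{-1} \Big\{ \sum_{j=1}^{t n_2} [n_2 P_{j(i)}]^{-1}\, V_{ij+} + \sum_{j=t n_2+1}^{n_2} [n_2 P_{j(i)}]^{-1}\, V_{ij+} \Big\} \tag{A.9}$$

where t is the proportion of sample schools where the two packages are
jointly administered. Also, the summations over schools in equations
(A.8) and (A.9) are likewise decomposed. Assuming that the schools where
both packages are administered constitute a simple random subsample of the
schools leads to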

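$$\begin{aligned}
\mathrm{Cov}_2[E(\hat{u} \mid 1,2), E(\hat{v} \mid 1,2) \mid 1]
&= \sum_{i=1}^{n_1} [n_1 P_i]^{-2}\, \mathrm{Cov}_2\Big[ \sum_{j=1}^{t n_2} [n_2 P_{j(i)}]^{-1} U_{ij+},\; \sum_{j=1}^{t n_2} [n_2 P_{j(i)}]^{-1} V_{ij+} \,\Big|\, 1 \Big] \\
&= \sum_{i=1}^{n_1} [n_1 P_i]^{-2}\, (t/n_2)\, \sigma_{uv_i}(2)
\end{aligned} \tag{A.10}$$

where

$$\sigma_{uv_i}(2) = \sum_{j=1}^{N_i} P_{j(i)} \big( U_{ij+}/P_{j(i)} - U_{i++} \big) \big( V_{ij+}/P_{j(i)} - V_{i++} \big),$$

$$U_{i++} = \sum_{j=1}^{N_i} U_{ij+}$$

and

$$V_{i++} = \sum_{j=1}^{N_i} V_{ij+}.$$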

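Combining (A.3), (A.4) and (A.10) yields

$$\mathrm{Cov}(\hat{u}, \hat{v} \mid 1) = \sum_{i=1}^{n_1} [n_1 P_i]^{-2}\, (t/n_2)\, \sigma_{uv_i}(2)$$

Thus,

$$\begin{aligned}
E_1[\mathrm{Cov}(\hat{u}, \hat{v} \mid 1)] &= (t/n_2)\, E_1\Big[ \sum_{i=1}^{n_1} (n_1 P_i)^{-2}\, \sigma_{uv_i}(2) \Big] \\
&= t\, \sigma_{uv}(2)/n_1 n_2
\end{aligned} \tag{A.13}$$

where

$$\sigma_{uv}(2) = \sum_{i=1}^{N} \sigma_{uv_i}(2)/P_i$$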

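Next, note that

$$E(\hat{u} \mid 1) = \sum_{i=1}^{n_1} [n_1 P_i]^{-1}\, U_{i++}$$

and

$$E(\hat{v} \mid 1) = \sum_{i=1}^{n_1} [n_1 P_i]^{-1}\, V_{i++}$$

Thus,

$$\begin{aligned}
\mathrm{Cov}_1[E(\hat{u} \mid 1), E(\hat{v} \mid 1)] &= \mathrm{Cov}_1\Big\{ \sum_{i=1}^{n_1} [n_1 P_i]^{-1} U_{i++},\; \sum_{i=1}^{n_1} [n_1 P_i]^{-1} V_{i++} \Big\} \\
&= \frac{1}{n_1} \sum_{i=1}^{N} P_i \big( U_{i++}/P_i - U_{+++} \big) \big( V_{i++}/P_i - V_{+++} \big) \\
&= \sigma_{uv}(1)/n_1
\end{aligned} \tag{A.16}$$

Combining (A.2), (A.13) and (A.16) yields

$$\mathrm{Cov}(\hat{u}, \hat{v}) = \sigma_{uv}(1)/n_1 + t\, \sigma_{uv}(2)/n_1 n_2 \tag{A.17}$$

which is the final desired representation.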
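The factor t/n_2 on the second-stage covariance can be checked exactly in a toy setting. The sketch below is a simplification and not the NAEP design: it assumes a single PSU, equal single-draw probabilities for schools, with-replacement draws, and two sample means that share m of their n_2 draws (so t = m/n_2). All names and population values are hypothetical.

```python
from fractions import Fraction
from itertools import product


def exact_cov_shared_draws(u, v, n2, m):
    """Exact covariance of two with-replacement sample means sharing m draws.

    Each mean uses n2 equal-probability draws from the same school list;
    the first m draws are common to both, the rest are independent.
    """
    N = len(u)
    n_draws = 2 * n2 - m  # m shared, n2 - m for u only, n2 - m for v only
    prob = Fraction(1, N ** n_draws)
    e_u = e_v = e_uv = Fraction(0)
    for draw in product(range(N), repeat=n_draws):
        shared, only_u, only_v = draw[:m], draw[m:n2], draw[n2:]
        u_bar = Fraction(sum(u[j] for j in shared + only_u), n2)
        v_bar = Fraction(sum(v[j] for j in shared + only_v), n2)
        e_u += prob * u_bar
        e_v += prob * v_bar
        e_uv += prob * u_bar * v_bar
    return e_uv - e_u * e_v


def single_draw_cov(u, v):
    """Covariance of (u_j, v_j) for one equal-probability draw."""
    N = len(u)
    mean_u = Fraction(sum(u), N)
    mean_v = Fraction(sum(v), N)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / N


# Hypothetical school totals for two packages in one PSU.
u_totals = [1, 2, 6]
v_totals = [2, 0, 4]
n2, m = 2, 1
t = Fraction(m, n2)

exact = exact_cov_shared_draws(u_totals, v_totals, n2, m)
predicted = t * single_draw_cov(u_totals, v_totals) / n2
```

In this toy case full enumeration and the t/n_2 formula give the same exact fraction, which mirrors the role of t in the second term of (A.17).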






